Conversation
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: devdanzin <74280297+devdanzin@users.noreply.github.com>
Force-pushed from d2de5dc to 726ec3c
Co-authored-by: Jacob Coffee <jacob@z7x.org>
…delines. Add the Guidelines to the contributing table.
savannahostrowski left a comment:
Thank you for doing this, @Mariatta!
My comments are mainly about extending the guidance to cover issues as well. While AI tooling can be great at surfacing real bugs and security issues, I think it's still important that those filing issues understand the problem themselves so we can keep discussions focused and productive.
On the section:

    Considerations for success
    ==========================
On the line:

    Authors must review the work done by AI tooling in detail to ensure it actually makes sense before proposing it as a PR.

Suggested change:

    Authors must review the work done by AI tooling in detail to ensure it actually makes sense before proposing it as a PR or filing it as an issue.
On the line:

    We expect PR authors to be able to explain their proposed changes in their own words.

Suggested change:

    We expect PR authors and those filing issues to be able to explain their proposed changes in their own words.
On the lines:

    Disclosure of the use of AI tools in the PR description is appreciated, while not required. Be prepared to explain how
    the tool was used and what changes it made.

Suggested change:

    Disclosure of the use of AI tools in the PR description is appreciated, while not required. Be prepared to explain how the tool was used and what changes it made.
Looks like some funky line breaking?
I had it break after 120 characters.
But now that I read the devguide's reST markup doc, it seems we're supposed to break at 80 characters.
https://devguide.python.org/documentation/markup/#use-of-whitespace
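As a rough illustration (not part of the devguide itself), re-wrapping prose to the 80-column limit discussed above can be done with Python's standard-library textwrap module; the paragraph text below is just the example sentence from this review thread:

```python
import textwrap

def rewrap(paragraph: str, width: int = 80) -> str:
    """Re-wrap a paragraph of prose to the given column width."""
    # Collapse any existing line breaks into single spaces,
    # then wrap at `width` columns.
    return textwrap.fill(" ".join(paragraph.split()), width=width)

text = (
    "Disclosure of the use of AI tools in the PR description is appreciated, "
    "while not required. Be prepared to explain how the tool was used and "
    "what changes it made."
)
wrapped = rewrap(text, width=80)
print(wrapped)
```

Every output line fits within 80 columns, and joining the wrapped lines back together recovers the original text, so the rewrap is lossless.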
On the lines:

    the responsibility of the contributor. We value good code, concise accurate documentation, and avoiding unneeded code
    churn. Discretion, good judgment, and critical thinking are the foundation of all good contributions, regardless of the
    tools used in their creation.
    Generative AI tools are evolving rapidly, and their work can be helpful. As with using any tool, the resulting
It wasn't done before in this file for some reason, but could we please wrap lines?
I was going to say the opposite :)
The rewrap makes it hard to review what has changed. Can we please keep a minimal diff for now, and only rewrap just before merge?
Just before merge sounds good to me :-)
On the lines:

    Sometimes AI assisted tools make failing unit tests pass by altering or bypassing the tests rather than addressing the
    underlying problem in the code. Such changes do not represent a real fix and are not acceptable.
I'd like to see this worded in more general terms rather than using such a specific example (older models did this a lot more than 2026's). What this is really getting at is that we want people to be cautious about reward hacking rather than addressing the actual underlying problem in a backwards compatible manner.
maybe something along the lines of:
"Some models have had a tendency toward reward hacking: making incorrect changes to fix their limited-context view of the problem at hand rather than focusing on what is correct, including altering or bypassing existing tests. Such changes do not represent a real fix and are not acceptable."
On the list:

    - Consider whether the change is necessary
    - Make minimal, focused changes
    - Follow existing coding style and patterns
    - Write tests that exercise the change
Should we add another bullet point along the lines of:
" - Keep backwards compatibility with prior releases in mind. Existing tests may be ensuring specific API behaviors are maintained."
perhaps a follow-up paragraph after this list:
"Pay close attention to your AI's testing behavior. Have conversations with your AI model about the appropriateness of changes given these principles before you propose them."
I would like text added to emphasize the dangers of AI assistants including work derived from training data, potentially violating the originals' copyrights and/or licensing terms. Core devs don't need that pointed out, but we have contributors of many backgrounds and experience levels. They're responsible for ensuring they have the legal right to grant the PSF permission to re-license their contributions, but explicit is better than implicit. Let's not assume "everyone knows" - everyone doesn't.
Absolutely not. Such words are reactionary made up non-specific dangers with nothing concrete to back them up. Thus they have no place in the Python devguide or policies because they are not actionable. Contributor guidelines, the CLA, and license terms have already long covered this from a policy point of view.
There are many examples from researchers of AI assistants duplicating training data verbatim, without attribution, blatantly violating copyright. How much more specific could it be? Newer users in particular are easily bamboozled by this, unaware of the issues, and seduced by the supremely confident tone AI assistants adopt. I'm concerned about them and the project.

The CLA doesn't even explicitly ask contributors to attest they have a legal right to license their contributions - that's all hiding behind the single word of legalese "valid". We haven't "long covered" this, because the intensified dangers of AI-produced code are a new development.

A few years back, a new contributor opened a PR with code copied verbatim from glibc. How did we catch it? Dead easy: a comment in the code plainly said what followed was copied from glibc. They simply didn't know any better at the time. BTW, they went on to become a core dev. And they knew they were copying. How much more likely is someone to unwittingly contribute work that was copied by their AI assistant? How would they know? How would we?

I'm not claiming we can "fix this". We can't. But we can - and IMO should - alert contributors that the risks of contributing derivative works are surely intensified by the use of AI assistants. Not to dissuade them, but to help inform their decisions. Not a change in policy, but pro-active education. You may as well argue that all cautions about AI-produced code are redundant. For example, why encourage people to "Keep backwards compatibility with prior releases in mind"? That's always been policy too.
BTW, Copilot assures me that provenance issues are the greatest danger projects face from use of AI tools. Being silent about that seems quite ill-advised. But it also tells me that few cases of AI-enabled copyright/licensing violations get any publicity. Organizations want to keep them quiet, and contributors who unwittingly submit tainted code are hardly likely to publicize it either. The chardet case is wildly atypical in every respect.
Tim, I realize my response came off harsh. Sorry! I do care about this, I just want to keep this doc focused on actionable guidance for CPython contributions rather than general AI education, which it'll always be behind on.

The reason I proposed a backwards-compat reminder but am pushing back on this one: backwards compatibility is something the core team actively evaluates on most every PR. Contributors often get it wrong, and it's a concrete thing they can guide their model to keep in mind. It's a problem we actually see, so nudging AI-using contributors to be proactive about it could have a clear payoff.

A provenance warning doesn't have the same shape. We don't have a pattern of AI-laundered copyrighted code showing up in CPython PRs, and even if a contributor reads the warning and takes it seriously, what are they supposed to do? There's no reasonable verification step we can ask of them, and none we can perform either. A caution with no corresponding action just creates unease, and I don't think that earns space in these guidelines.

The licensing obligation itself is real and already lives in the CLA. I'm also not well placed to debate licensing specifics in a public thread, so I'll leave that side of it alone. If the concern is that newer contributors don't understand what they're agreeing to in the CLA, that's worth raising with the PSF as a CLA question rather than something we patch with an AI-specific note here.

My backwards-compat suggestion doesn't have to go in either, FWIW; that's Mariatta's and the docs reviewers' call. But I'd be sad to see a provenance warning land, as I think it'd detract from what's otherwise shaping up to be a refreshingly practical AI guidelines doc. For this PR I suggest we proceed with the other reviews and table the provenance discussion. It isn't something this PR can resolve.